\(\newcommand{\E}{\mathbb{E}} \newcommand{\Var}{\mathrm{Var}}\) \(\newcommand{\L}{\mathscr{L}} \newcommand{\LL}{\mathscr{l}}\)
Theoretical outlook for the one-shot Prisoner’s Dilemma game: the standard equilibrium prediction is the uncooperative \(P,P\) outcome, which is Pareto-dominated by the cooperative outcome.
Theoretical outlook for repeated play with a known last round also predicts uncooperative play. Maintaining cooperation requires a credible threat of retribution/reciprocity. If the last round is known, defection is optimal there, so no threat is credible in the round before it; backward induction repeats this argument round by round and essentially results in an Always Defect strategy.
Some key findings:
Normalizing parameters in Fig. 1b: one-shot gain from defection \(g\) (compared to cooperative outcome) and one-shot loss \(\LL\) from being defected on (compared to non-cooperative outcome).
\[g = (T - P) / (R - P) - 1 > 0\] \[\LL = -(S - P) / (R - P) > 0\]
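The normalization above can be sketched in a few lines of Python. The function name and the example payoffs below are hypothetical, chosen only to illustrate the arithmetic; they are not taken from any of the studies.

```python
def normalize_payoffs(T, R, P, S):
    """Return (g, l) for a Prisoner's Dilemma with payoffs T > R > P > S.

    g: one-shot gain from defection, relative to the cooperative outcome.
    l: one-shot loss from being defected on, relative to the non-cooperative outcome.
    """
    if not (T > R > P > S):
        raise ValueError("expected T > R > P > S")
    scale = R - P                 # normalizes so that R -> 1 and P -> 0
    g = (T - P) / scale - 1
    l = -(S - P) / scale
    return g, l


# Hypothetical stage game (illustrative numbers only).
g, l = normalize_payoffs(T=5, R=3, P=1, S=0)
print(g, l)
```

After this rescaling the stage game reads \(R = 1\), \(P = 0\), \(T = 1 + g\), \(S = -\LL\), which makes games with different raw payoffs directly comparable.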
Cooperation measures reported in the studies included in the metastudy tended to focus on:
Data from the included studies were revisited using a uniform methodology, referred to as the standard perspective; these cooperation rate outcome variables were calculated for all available data. In Table 1, note that studies are ordered from short to long horizon \(H\) and, within each \(H\), from high to low gain parameter \(g\); this ordering tracks the variation in cooperation rates well.
Unpacking “folk wisdom” vs. the proposed understanding of the effect of horizon on cooperation: average cooperation and the round of first defection increase with horizon \(H\), which is consistent with the idea that backward induction becomes more “difficult” as the horizon grows. But a longer horizon also increases the difference in value between joint cooperation and joint defection: more rounds mean a larger payoff gain from joint cooperation, while the risk stays the same (one round of “sucker” status regardless of horizon).
Defining Basin of Attraction toward “Always Defect” strategy, relative to “Grim Trigger” strategy and based on infinitely repeated PD experiments:
\[sizeBAD = \frac{\LL}{(H - 1) + \LL - g}\] \(sizeBAD \in (0, 1)\) is increasing in \(g\) and \(\LL\) and decreasing in \(H\). It is the probability that a player must assign to the counterpart playing “grim” in order to be exactly indifferent between playing grim and Always Defect (AD).
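A brief sketch of where this expression comes from, assuming the normalized stage-game payoffs \(R = 1\), \(P = 0\), \(T = 1 + g\), \(S = -\LL\) and no discounting across the \(H\) rounds (both are assumptions of this sketch). Let \(p\) be the probability that the counterpart plays grim rather than AD. Grim earns \(H\) against grim and \(-\LL\) against AD; AD earns \(1 + g\) against grim and \(0\) against AD. Indifference between the two requires
\[pH - (1 - p)\LL = p(1 + g)\]
\[p\left[(H - 1) + \LL - g\right] = \LL \quad \Rightarrow \quad p = \frac{\LL}{(H - 1) + \LL - g}\]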
As \(sizeBAD \rightarrow 1\) cooperation seems less likely (the player has to have very high belief in the counterpart’s probability of cooperating).
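As a rough numerical illustration of the \(sizeBAD\) formula, the sketch below computes it for hypothetical values of \(g\), \(\LL\), and \(H\) (not the parameters of any treatment), and checks the indifference condition under the assumed normalization \(R = 1\), \(P = 0\).

```python
def size_bad(g, l, H):
    """Belief in 'grim' at which a player is indifferent between grim and AD."""
    denom = (H - 1) + l - g
    if denom <= 0:
        raise ValueError("formula requires (H - 1) + l - g > 0")
    return l / denom


def payoff_grim(p, g, l, H):
    # Assumed normalized payoffs: H against grim, -l against AD.
    return p * H - (1 - p) * l


def payoff_ad(p, g, l, H):
    # Assumed normalized payoffs: 1 + g against grim, 0 against AD.
    return p * (1 + g)


# Hypothetical parameters: same stage game, two horizons.
for H in (4, 8):
    p_star = size_bad(g=1.0, l=0.5, H=H)
    gap = payoff_grim(p_star, 1.0, 0.5, H) - payoff_ad(p_star, 1.0, 0.5, H)
    print(H, round(p_star, 4), round(gap, 10))
```

In this example, doubling the horizon shrinks the basin of attraction of AD, so a smaller belief in a cooperative counterpart is enough to make grim worthwhile.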
Two stage-game payoffs \(\times\) two horizons, played for 30 supergames:
Three sessions for each treatment, though each subject experienced one set of treatment parameters (payoffs and horizon).
Payment at conclusion based on the earnings during the entire 30-supergame session.
In long-horizon treatments, cooperation in initial rounds increases with experience - cooperation in last five supergames higher than that in first five (not consistent in short-horizon treatments).
Experience leads to lower cooperation in later rounds, for all treatments - cooperation in last five supergames lower than that in first five.
Stage-game parameters significantly affect initial cooperation and how it changes with experience.
Cooperation rates for D8 and E4 are similar, suggesting that the horizon effect on initial cooperation rates is largely captured by the combined \(sizeBAD\) parameter (value of cooperation). This does not support the “folk wisdom” that higher cooperation rates at longer horizons stem from the difficulty of backward induction calculations.
## The Breakdown of Cooperation {.tabset}
Unraveling of cooperation across supergames: behavior at the end of a supergame moves slowly in the direction suggested by backward induction. Threshold strategy appears to evolve to have an earlier and earlier threshold.
A threshold strategy involves cooperation until one player defects, followed by defection by both players thereafter. Cooperation after a defection is inconsistent with a threshold strategy, so play that follows a threshold strategy should show no gap between the first defection and the last cooperation.
Fig. 07 shows that play becomes more consistent with threshold strategies with experience.
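The consistency check described above can be sketched as follows. This is one reading of the definition in these notes, not necessarily the exact classification rule used in the paper; the `"C"`/`"D"` string encoding and the function name are assumptions made for illustration.

```python
def is_threshold_consistent(own, other):
    """True if `own` never cooperates after the first round in which
    either player defected.

    `own` and `other` are equal-length strings of 'C'/'D', one character
    per round of a single supergame (an assumed encoding).
    """
    if len(own) != len(other):
        raise ValueError("action sequences must have the same length")
    for r, (a, b) in enumerate(zip(own, other)):
        if a == "D" or b == "D":
            # From the next round on, every own action must be a defection.
            return all(x == "D" for x in own[r + 1:])
    return True  # nobody ever defected


print(is_threshold_consistent("CCDD", "CCCD"))  # defects from round 3 onward
print(is_threshold_consistent("CDCD", "CCCC"))  # cooperates again after defecting
```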
(Left) Conditional on cooperation in first round, mean round of first defection declines with experience. (Right) Round of first defection for early, mid, and late supergames - cooperation breakdown shifts earlier by one round for every ten supergames (though probability of breakdown at start of game decreases)
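The left-panel statistic can be computed along these lines. The sketch assumes each supergame is recorded as a string of a subject's own `"C"`/`"D"` actions, and that a supergame with no defection is coded as horizon + 1; both are assumptions of this example rather than details confirmed by these notes.

```python
def round_of_first_defection(actions, horizon):
    """1-based round of the first 'D'; horizon + 1 if there is none
    (an assumed coding convention)."""
    for r, a in enumerate(actions, start=1):
        if a == "D":
            return r
    return horizon + 1


def mean_first_defection(supergames, horizon):
    """Mean round of first defection over supergames that start with 'C'."""
    rounds = [
        round_of_first_defection(s, horizon)
        for s in supergames
        if s and s[0] == "C"          # condition on first-round cooperation
    ]
    return sum(rounds) / len(rounds) if rounds else None


# Hypothetical data: three supergames with horizon 4.
print(mean_first_defection(["CCDD", "CCCC", "DDDD"], horizon=4))
```

Applying this separately to early, middle, and late supergames would give the kind of comparison the figure describes.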
Decreasing cooperation rates start in last round, and gradually shift toward earlier rounds with experience.
Consistency with threshold strategy for other treatments.
\[\beta_{it + 1} = \theta_i \beta_{it} + L_{it}\] where \(\beta^k_{it}\) is the weight that subject \(i\) places on the opponent adopting strategy \(k\) in supergame \(t\), \(\theta_i\) is the discounting of past beliefs, and \(L_{it}\) is the update vector given play in supergame \(t\). \(L^k_{it}\) takes the value 1 when strategy \(k\) is the unique strategy most consistent with the opponent’s play within supergame \(t\); for all other strategies, the update vector takes the value 0.
\[\vec{\mu}_{it} = \vec{u}_{it} + \lambda_i \vec{\epsilon}_{it}\] where \(\vec{u}_{it} = \vec{U}\beta_{it}\) and \(\vec{U}\) is a square matrix giving the payoff of each strategy against every other strategy (a function of \(H\) and the stage-game payoffs). \(\lambda_i\) is a scaling parameter measuring how well the subject responds to beliefs, and \(\vec{\epsilon}_{it}\) is a vector of idiosyncratic error terms.
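A minimal sketch of one step of this belief update, written in plain Python. The function and argument names are hypothetical, and passing `None` when no single strategy is uniquely most consistent with the opponent's play is an assumption of this sketch.

```python
def update_beliefs(beta, theta, observed_k=None):
    """One step of beta_{t+1} = theta * beta_t + L_t.

    beta:       list of current weights, one per candidate strategy.
    theta:      discount factor applied to past beliefs.
    observed_k: index of the unique strategy most consistent with the
                opponent's play in this supergame, or None if there is
                no unique such strategy (L_t is then all zeros).
    """
    L = [0.0] * len(beta)
    if observed_k is not None:
        L[observed_k] = 1.0
    return [theta * b + l_k for b, l_k in zip(beta, L)]


# Hypothetical example with two strategies (index 0 and index 1).
beta = [1.0, 2.0]
beta = update_beliefs(beta, theta=0.5, observed_k=1)
print(beta)
```

With \(\theta_i < 1\), older observations fade geometrically, so recent supergames carry more weight in the belief vector.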
Probability of choosing strategy \(k\):
\[p^k_{it} = \frac{\exp\left(\frac{u^k_{it}}{\lambda_i}\right)}{\sum_{k'} \exp\left(\frac{u^{k'}_{it}}{\lambda_i}\right)}\]
For each subject, the authors estimated (using MLE) \(\beta_{i0}, \lambda_i, \sigma_i\) (initial beliefs, noise in strategy choice, noise in action choice implementation), and \(\theta_i, \kappa_i\) (how beliefs are updated, and how execution noise changes over time).
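The choice rule can be sketched as a standard logit (softmax) over expected payoffs. The helper names and the two-strategy payoff matrix below are hypothetical; the matrix assumes the normalized payoffs \(R = 1\), \(P = 0\) with illustrative values \(H = 4\), \(g = 1\), \(\LL = 0.5\), and the beliefs are treated as already normalized to sum to one.

```python
import math


def expected_payoffs(U, beta):
    """u = U @ beta: expected payoff of each strategy given belief weights."""
    return [sum(U[k][j] * beta[j] for j in range(len(beta)))
            for k in range(len(U))]


def choice_probabilities(u, lam):
    """Logit probabilities p^k = exp(u^k / lam) / sum_k' exp(u^k' / lam)."""
    m = max(u)                                  # subtract max for numerical stability
    weights = [math.exp((x - m) / lam) for x in u]
    total = sum(weights)
    return [w / total for w in weights]


# Rows/columns: [grim, AD]; entries are supergame payoffs (hypothetical values).
U = [[4.0, -0.5],    # grim vs grim = H, grim vs AD = -l
     [2.0,  0.0]]    # AD vs grim = 1 + g, AD vs AD = 0
beta = [0.6, 0.4]    # assumed belief: 60% grim, 40% AD
u = expected_payoffs(U, beta)
print(choice_probabilities(u, lam=0.5))
```

A smaller \(\lambda_i\) concentrates probability on the strategy with the highest expected payoff, while a larger \(\lambda_i\) pushes choices toward uniform randomization.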
Which factors contribute to the sustained cooperation seen in long-run behavior in figure XI?
What if a small fraction of cooperative subjects sustains long-term cooperation?
Key Findings: